Workshop: "Data Science on Hadoop"

Track: Workshops / Time: Wednesday 09:00 - 16:00 / Location: Rode Kamer

In this full-day workshop on data science using Apache Hadoop, you will learn how to work with large data sets, extract meaningful information from them, and apply machine learning models to build data-driven functionality. You will work on a real-world, substantially large data set on a full-blown Hadoop cluster (running in the cloud).

We will start off with an introduction to the activities of a data scientist and some of the concepts involved. In the first part we will get hands-on with exploratory data analysis on a large data set using Apache Hadoop, Apache Spark and Python. In the second part we will create a full-blown data science solution using a large data set and machine learning models.

This workshop focuses on getting hands-on with these subjects rather than on theory.

Learning outcomes:

Understand the Data Science process
Basic use of some Data Science tools for Big (and smaller) Data
Basic use of Apache Hadoop and Apache Spark
Data visualisation for exploratory analysis
Basic knowledge of machine learning models

Target Audience
Software engineers who want to get hands-on with data science. Coding skills are required. No prior knowledge of data science or machine learning is expected. Some experience in Python is helpful, but not a necessity.

Technical Requirements
You need a laptop that allows SSH access to a server and has a web browser. Additionally, a text editor can come in handy.

The workshop is limited to 20 attendees.

Friso van Vollenhoven, CTO of GoDataDriven

Biography: Friso van Vollenhoven

Friso is CTO of GoDataDriven. With a background in software engineering, he is currently active in the area where systems and software engineering overlap with applied, large-scale data processing. He is a long-time Hadoop user, track chair of the Hadoop track at the GOTO conference in Amsterdam, and organiser of The Amsterdam Applied Machine Learning meetup group and the Dutch Hadoop User Group.

Twitter: @fzk

Ivo Everts, Data Driver at GoDataDriven

Biography: Ivo Everts

Ivo is a Data Driver at GoDataDriven, where he works on data science and machine learning solutions. He holds a PhD in image processing and machine learning. After his PhD research, Ivo joined GoDataDriven to apply his skills and solve real problems for industrial and corporate clients. He is involved in anything related to data science and is passionate about knowledge sharing.

If you want to go data driven, you need a good data driver.